Image Segmentation with ENet¶

With the rise of AI and machine learning, new doors have been opened in the automotive industry for self-driving vehicles. These vehicles use AI models to interpret their surroundings so that they can make the correct decisions to get you from A to B with everyone in one piece. A good yet simple example of one of these models is ENet Cityscapes. This model performs what is called image segmentation, which splits a photo into several regions based on what the model thinks is in them. Here is an example:

https://img.youtube.com/vi/HbPhvct5kvs/0.jpg

Hardware Needed:¶

All you are going to need for this notebook is an ultrasonic sensor and a Grove PMOD adapter.

ultrasonic-sensor.png

pmod.png

Useful Term Cheat Sheet:¶

Here is a cheat sheet of some useful terms that may pop up in this notebook. As always, feel free to ask me (Ben) any questions!

Term | Description
DPU | Data Processing Unit
ENet | This is the segmentation model we are using!
Segmentation | Splitting a photo into different areas based on what is in them.
Overlay | Overlays are designs for the FPGA
FPGA | Field Programmable Gate Array. This is a special kind of chip that we can reprogram to do many different tasks very efficiently!
Array | Another word for list. Multiple variables or items stored under one variable name.
Tensor | A multidimensional array (arrays within arrays). A 2-D tensor is also called a matrix.
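To make the "array" and "tensor" rows of the cheat sheet concrete, here is a tiny NumPy sketch (NumPy gets imported properly in the next cell; this snippet is just for illustration):

```python
import numpy as np

# A 1-D array: a plain list of numbers.
a = np.array([1, 2, 3])
print(a.shape)   # (3,)

# A 3-D tensor: arrays within arrays within arrays.
# Shape (2, 3, 4) means 2 blocks, each holding 3 rows of 4 numbers.
t = np.zeros((2, 3, 4))
print(t.shape)   # (2, 3, 4)
print(t.ndim)    # 3 -- the number of dimensions
```

The model's input tensor later in this notebook has shape (1, 512, 1024, 3): one image, 512 rows, 1024 columns, 3 color channels.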

Importing Libraries:¶

In [ ]:
# Let's import some libraries we will need later on:
import os                                               # The os library provides functions for interacting with your operating system (your computer system)
import time                                             # The time library provides time-related functions like sleep()
import numpy as np                                      # numpy (np) is a library for processing numerical data. It's mainly just math stuff.
import cv2                                              # cv2 (OpenCV) is a library for image processing which we need later
from PIL import Image                                   # PIL's Image module adds some image processing tools too
import matplotlib.pyplot as plt                         # matplotlib.pyplot is a great tool for displaying data on graphs or other media
%matplotlib inline
from pynq_dpu import DpuOverlay                         # DpuOverlay lets us load the DPU design onto the FPGA


# Now let's define some variables for later and get the DPU ready:
image_folder = 'img'                                    # Specifies which folder to look in for photos                                            

Overlay Setup:¶

Now let's define our overlay and load the ENet model onto it.

In [ ]:
overlay = DpuOverlay("dpu.bit")                
overlay.load_model("pt_ENet.xmodel")

Creating an image list:¶

This cell creates a list of images we intend to use as inputs. It then prints how many images it finds.

In [ ]:
# Here we are creating a list that contains all of the .png photos in the image folder we specified earlier
original_images = [i for i in os.listdir(image_folder) if i.endswith(".png")]
total_images = len(original_images)
print("Number of Photos: " + str(total_images))

Preprocessing Setup:

Here we are going to define a few lists and a function that will be needed for preprocessing. The model cannot take an image as an input directly, so these lists and functions are used to turn the photo into inputs the model can understand.

In [ ]:
# The palette list provides the colors to be used in our output photo later
pallete = [128, 64, 128, 244, 35, 232, 70, 70, 70, 102, 102, 156, 190, 153, 153, 153, 153, 153, 250, 170, 30,
           220, 220, 0, 107, 142, 35, 152, 251, 152, 70, 130, 180, 220, 20, 60, 255, 0, 0, 0, 0, 142, 0, 0, 70,
           0, 60, 100, 0, 80, 100, 0, 0, 230, 119, 11, 32 ]

# These are some preprocessing variables and a function that help the model interpret the input photo
# Don't worry if this looks like gibberish to you! This is beyond what you need to know for this camp.
MEANS = [.485, .456, .406]                              # Standard ImageNet per-channel means (R, G, B)
STDS = [.229, .224, .225]                               # Standard ImageNet per-channel standard deviations

def preprocess_fn(image):
    image = image.astype(np.float32)
    image =  image / 255.0
    for j in range(3):
        image[:, :, j] -= MEANS[j]
    for j in range(3):
        image[:, :, j] /= STDS[j]
    return image
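As a quick sanity check, here is a sketch of what preprocess_fn does to a single pixel. The 1x1 "image" below is made up just for this demo; each channel ends up as (value/255 - mean) / std:

```python
import numpy as np

MEANS = [.485, .456, .406]
STDS = [.229, .224, .225]

def preprocess_fn(image):
    image = image.astype(np.float32)
    image = image / 255.0
    for j in range(3):
        image[:, :, j] -= MEANS[j]
    for j in range(3):
        image[:, :, j] /= STDS[j]
    return image

# A fake 1x1 image whose only pixel is mid-grey (128, 128, 128):
dummy = np.full((1, 1, 3), 128, dtype=np.uint8)
out = preprocess_fn(dummy)
print(out[0, 0])   # each channel is now (128/255 - mean) / std
```

Note that the values come out roughly centered around zero, which is the range the model was trained on.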

Initializing the DPU: The runner is how we pass tensors through to the model. Let's initialize it here!

In [ ]:
dpu = overlay.runner

Tensors:

Tensors are how we pass information in and out of models. You can think of tensors as arrays of arrays of arrays ... or a "multidimensional array" if you want to sound cool. (Array is another word for list!)

Here we are looking to find what the dimensions of the input and output tensors are so we know how to shape our data.

In [ ]:
# Here we assign the input and output tensors to variables we can use later.
inputTensors = dpu.get_input_tensors()
outputTensors = dpu.get_output_tensors()

shapeIn = tuple(inputTensors[0].dims)   # (1, 512, 1024, 3)        
shapeOut = tuple(outputTensors[0].dims) # (1, 512, 1024, 19)
# outputSize = int(outputTensors[0].get_data_size() / shapeIn[0]) # While you don't need this, you can find out the size of your output too!

Shaping the data:

Here we want to shape our data to the tensor dimensions we found in the last cell. We do this through a library called numpy, which is really good for numerical computing.

In [ ]:
input_data = [np.empty(shapeIn, dtype=np.float32, order="C")]    # Buffer that will hold the input tensor
output_data = [np.empty(shapeOut, dtype=np.float32, order="C")]  # Buffer that will hold the output tensor
image = input_data[0]                                            # Handy alias for the input buffer, used in run() below

Where all the fun happens!

Here we define our run function which takes our image and segments it. This uses all of the bits and pieces we defined earlier!

In [ ]:
def run(image_index, display=False):
    # Read input image
    input_image = cv2.imread(os.path.join(image_folder, original_images[image_index]))
    
    # Pre-processing
    resized = cv2.resize(input_image,(1024,512))
    preprocessed = preprocess_fn(resized)
    
    # Send the data to the DPU and trigger it
    image[0,...] = preprocessed.reshape(shapeIn[1:])
    job_id = dpu.execute_async(input_data, output_data)
    dpu.wait(job_id)                                       
    
    # Retrieve output data
    classMap_numpy = np.argmax(output_data[0][0], axis=-1).astype(np.uint8)
    classMap_numpy = Image.fromarray(classMap_numpy)
    classMap_numpy_color = classMap_numpy.copy()
    classMap_numpy_color.putpalette(pallete)
    if display:
        _, ax = plt.subplots(1) # Display segmented Image
        _ = ax.imshow(classMap_numpy_color)
        # Display the original image (OpenCV loads BGR; matplotlib expects RGB)
        _, ax = plt.subplots(1)
        _ = ax.imshow(cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB))
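The postprocessing step inside run() is worth seeing in miniature: np.argmax picks, for every pixel, the class with the highest score, turning the (H, W, classes) output into an (H, W) class map, and putpalette then colors each class index. A toy sketch (the 2x2 scores and 3-color palette here are made up, not the real 19-class output):

```python
import numpy as np
from PIL import Image

# Fake model output: a 2x2 image with scores for 3 classes per pixel.
scores = np.array([[[0.1, 0.8, 0.1], [0.9, 0.05, 0.05]],
                   [[0.2, 0.2, 0.6], [0.7, 0.2, 0.1]]])

# argmax over the last axis picks the winning class for each pixel:
class_map = np.argmax(scores, axis=-1).astype(np.uint8)
print(class_map)   # [[1 0]
                   #  [2 0]]

# A palette is a flat list of R, G, B triples, one triple per class index:
toy_palette = [128, 64, 128,   # class 0
               244, 35, 232,   # class 1
               70, 70, 70]     # class 2
img = Image.fromarray(class_map)
img.putpalette(toy_palette)    # each class index now maps to a color
```

The real pallete list at the top of this notebook works the same way, just with 19 color triples for the 19 Cityscapes classes.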

Go time!

Let's finally run this thing by passing in one of our photos from the original_images list!

In [ ]:
run(0, display=True) # Note the first argument of the run function takes the
                     # index of the photo you want, not the photo itself.

But Wait! There's More!

We can use a simple for loop to do this for all of the images we found earlier.

In [ ]:
for i in range(total_images):
    run(i, display=True)

Cleanup: Just like in the other notebooks, we have to release the overlay and DPU so we can use them later!

In [ ]:
del overlay
del dpu

Distance Measurement with Ultrasonic Sensors¶

Setup: Attach the PMOD adapter to your board and then attach the ultrasonic sensor to the G4 port. This is what it should look like:

Ultrasonic connector.jpg

Another important aspect of automated driving is being able to sense how far ahead of you an object is. While self-driving cars use something called LIDAR to range objects, we will be using an ultrasonic sensor. It sends out sound waves well above the range that we can hear, then uses how long it takes for that sound to bounce off an object and return to the sensor to calculate the distance. Here is some simple code to take 10 distance samples:
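The distance math behind the sensor is simple: sound travels at roughly 343 m/s in air, and the pulse covers the distance twice (out and back), so we halve the round trip. Here is a sketch of that calculation (the function and the echo time below are made up for illustration; the real sensor does this for you inside get_distance()):

```python
SPEED_OF_SOUND = 343.0  # meters per second in air at about 20 C

def echo_time_to_cm(echo_time_s):
    """Convert a round-trip echo time (seconds) to a one-way distance in cm."""
    distance_m = echo_time_s * SPEED_OF_SOUND / 2   # halve it: out and back
    return distance_m * 100                         # meters -> centimeters

# A round trip of about 5.83 ms corresponds to roughly 1 meter:
print('{:.1f} cm'.format(echo_time_to_cm(0.00583)))   # 100.0 cm
```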

Adapter and Device Setup

In [ ]:
from pynq_peripherals import PmodGroveAdapter
from kv260 import BaseOverlay

base = BaseOverlay('base.bit')

adapter = PmodGroveAdapter(base.PMODA, G4='grove_usranger')
usranger = adapter.G4

Our small ultrasonic sensor can only measure distances up to 5 meters. Anything past that and it will print 500 cm!

Here's a simple loop that prints the measured distance 10 times:

In [ ]:
from time import sleep
for i in range(10):
    print('distance: {:.2f} cm'.format(usranger.get_distance()))
    sleep(1)
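Single readings can be noisy. One common trick (a sketch, not part of the pynq_peripherals API) is to average a few samples and ignore the 500 cm out-of-range sentinel:

```python
def average_distance(samples, out_of_range=500):
    """Average a list of readings, ignoring out-of-range sentinel values."""
    valid = [s for s in samples if s < out_of_range]
    return sum(valid) / len(valid) if valid else out_of_range

# Made-up readings for the demo; with the real sensor you would collect
# them with something like [usranger.get_distance() for _ in range(5)].
readings = [101.2, 99.8, 500.0, 100.4]   # one out-of-range glitch
print(average_distance(readings))        # average of the three valid readings
```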
In [ ]:
del base

Extra Challenges:¶

Was this too fast or boring? Here are some challenges for you!

Easy: Change the code to add the .jpeg file in the img directory to the original_images list.

Medium: Change the run function to accept images instead of photo indexes (i.e. you would call it with original_images[0] instead of 0).

Hard: Detect and print what the color is at the very center of the photo.

Do you just want my job???: Detect the color at the center of the screen and print "CAR IN FRONT" on the OLED display if the color is blue.